Learning to Predict Movie Ratings from the Netflix Dataset

Authors

  • Dimitar Nikolov
  • DongInn Kim
Abstract

In this paper, we describe a hybrid recommendation system that combines the two main approaches to recommendation: collaborative filtering and content-based classification. We build a collaborative filtering framework that constructs a user-item matrix of ratings and produces recommendations based on user-user similarity computed with Pearson correlation. We tackle the sparsity of the user-item matrix by training a Naive Bayes classifier for each user and using it to predict the unknown ratings in the matrix. We present results from applying our approach to a movie recommendation database provided by the Netflix movie rental service.
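
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which content features feed the per-user classifier, so the binary genre flags, the toy ratings, the Laplace smoothing, and every function name below (`pearson`, `nb_predict`, `densify`, `predict`) are assumptions made for illustration only.

```python
from math import log, sqrt

def pearson(a, b):
    """Pearson correlation between two users over the movies both have rated."""
    common = [m for m in a if m in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[m] for m in common) / len(common)
    mean_b = sum(b[m] for m in common) / len(common)
    cov = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    var_a = sum((a[m] - mean_a) ** 2 for m in common)
    var_b = sum((b[m] - mean_b) ** 2 for m in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def nb_predict(user_ratings, features, movie, classes=(1, 2, 3, 4, 5)):
    """Per-user Naive Bayes over binary movie features, with Laplace smoothing."""
    n = len(user_ratings)
    best, best_score = None, float("-inf")
    for c in classes:
        rated_c = [m for m, r in user_ratings.items() if r == c]
        # Smoothed log prior of the rating class for this user.
        score = log((len(rated_c) + 1) / (n + len(classes)))
        # Smoothed log likelihood of each feature value given the class.
        for j, x in enumerate(features[movie]):
            match = sum(1 for m in rated_c if features[m][j] == x)
            score += log((match + 1) / (len(rated_c) + 2))
        if score > best_score:
            best, best_score = c, score
    return best

def densify(ratings, features):
    """Keep known ratings; fill the unknown ones with the user's own classifier."""
    return {u: {m: r[m] if m in r else nb_predict(r, features, m) for m in features}
            for u, r in ratings.items()}

def predict(dense, user, movie):
    """Mean-centered, similarity-weighted average over the other users."""
    # Exclude the target movie from the user's own row before comparing.
    own = {m: r for m, r in dense[user].items() if m != movie}
    mean_u = sum(own.values()) / len(own)
    num = den = 0.0
    for v, rv in dense.items():
        if v == user:
            continue
        w = pearson(own, rv)
        mean_v = sum(rv.values()) / len(rv)
        num += w * (rv[movie] - mean_v)
        den += abs(w)
    return mean_u if den == 0 else mean_u + num / den

# Hypothetical toy data: each movie has (is_action, is_romance) flags.
features = {"m1": (1, 0), "m2": (1, 0), "m3": (0, 1), "m4": (0, 1)}
ratings = {
    "alice": {"m1": 5, "m2": 5, "m3": 1},
    "bob": {"m1": 5, "m3": 1, "m4": 1},
    "carol": {"m1": 1, "m2": 2, "m3": 5, "m4": 5},
}
dense = densify(ratings, features)
print(predict(dense, "alice", "m4"))
```

Filling the matrix first means the Pearson similarity is computed over every movie rather than only the few that two users happen to share, which is the sparsity problem the abstract refers to. The trade-off is that classifier errors propagate into the similarity scores.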

Similar resources

How To Break Anonymity of the Netflix Prize Dataset

As part of the Netflix Prize contest, Netflix recently released a dataset containing movie ratings of a significant fraction of their subscribers. The dataset is intended to be anonymous, and all customer identifying information has been removed. We demonstrate that an attacker who knows only a little bit about an individual subscriber can easily identify this subscriber’s record if it is prese...

The Netflix Prize

In October, 2006 Netflix released a dataset containing 100 million anonymous movie ratings and challenged the data mining, machine learning and computer science communities to develop systems that could beat the accuracy of its recommendation system, Cinematch. We briefly describe the challenge itself, review related work and efforts, and summarize visible progress to date. Other potential uses...

A Million Dollar Reward: Accurate Online Prediction of Movie Ratings

Introduction: We explore the issues that are present in the Netflix Prize dataset. The Netflix Prize seeks to substantially improve the accuracy of user movie rating prediction based on their previous movie preferences and ratings [1]. The contest started in October 2006 and seeks to beat the current Netflix recommendation system by 10% in prediction accuracy. Though some teams have improved pr...

Exploring collaborative filters: Neighborhood-based approach

In this project, we study the effectiveness of collaborative filtering mechanisms in the context of the Netflix competition. We focus our attention on a dataset provided by Netflix which includes a training set with more than 100 million 4-tuples: user id, movie id, rating, and date [3]. In the first part of this project, we develop a simple model to predict future ratings of users based on the...

Statistical Analysis and Application of Ensemble Method on the Netflix Challenge

1. Introduction The Netflix Prize project was proposed by Netflix Inc. in order to seek accurate predictions of movie ratings. As one group in the Stanford Netflix Prize team, our responsibility is to explore useful statistics and data curation in the training data set, and to explore ensemble methods for improving prediction accuracies. We imported the Netflix data into a MySQL database for...

Publication date: 2010